Skip to content

fix: exclude submodules outside src folder when src is specified#1256

Merged
ben-edna merged 8 commits into
mainfrom
claude/subproject-submodules-fetch-e2y7d2
Jun 12, 2026
Merged

fix: exclude submodules outside src folder when src is specified#1256
ben-edna merged 8 commits into
mainfrom
claude/subproject-submodules-fetch-e2y7d2

Conversation

@spoorcc

@spoorcc spoorcc commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

When a subproject specifies a src directory, git submodule update --init
would clone ALL submodules (including those outside src), leaving their
directories in the destination. _apply_src_and_ignore now detects which
submodules are outside src (path unchanged by strip_glob_prefix), removes
their top-level directory, and excludes them from the returned list.

Two new BDD scenarios cover this:

  • plain src directory (non-glob) with a submodule inside
  • mixed submodules inside and outside src, only inside is fetched

https://claude.ai/code/session_01KjAnpdH8M33gT1yZPraVx8

Summary by CodeRabbit

  • Bug Fixes

    • Only submodules inside a declared src directory are fetched; submodules outside src are excluded and src subtrees are preserved and promoted correctly.
  • Tests

    • Added feature and unit tests covering submodule discovery, pruning, promotion, and expected CLI/output structure when a src is declared.
  • Refactor

    • Improved logging to remove extra blank lines in asciinema captures and simplified license inference handling.

When a subproject specifies a src directory, git submodule update --init
would clone ALL submodules (including those outside src), leaving their
directories in the destination. _apply_src_and_ignore now detects which
submodules are outside src (path unchanged by strip_glob_prefix), removes
their top-level directory, and excludes them from the returned list.

Two new BDD scenarios cover this:
- plain src directory (non-glob) with a submodule inside
- mixed submodules inside and outside src, only inside is fetched

https://claude.ai/code/session_01KjAnpdH8M33gT1yZPraVx8
@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b0dc437d-ba83-405d-801b-5b80a416b73e

📥 Commits

Reviewing files that changed from the base of the PR and between 84c6b76 and 3ee3ffb.

📒 Files selected for processing (5)
  • dfetch/log.py
  • dfetch/vcs/git.py
  • features/fetch-git-repo-with-submodule.feature
  • pyproject.toml
  • tests/test_git_vcs.py

Walkthrough

Centralize src-aware submodule filtering into _filter_submodules_by_src, which rewrites or prunes submodule paths, removes scheduled paths, promotes src to the repo root, and returns surviving submodules. Add feature scenarios and tests for submodules inside/outside a plain src. Also refactor Rich logging handler and license inference branching.

Changes

Submodule filtering when src is specified

Layer / File(s) Summary
Src-based submodule filtering and promotion
dfetch/vcs/git.py, features/fetch-git-repo-with-submodule.feature, tests/test_git_vcs.py
_apply_src_and_ignore delegates src-aware filtering to _filter_submodules_by_src, which uses strip_glob_prefix to rewrite paths for submodules under src, schedules/removes outside-src top-level/full paths with safe_rm (guarding shared src), removes now-empty ancestor directories, pre-removes .git/.gitmodules from a promoted chosen path, calls _move_src_folder_up, and returns surviving within_src submodules. Feature scenarios and tests validate inclusion of submodules under a plain src and exclusion of outside-src submodules.
Logging and license inference small refactors
dfetch/log.py, dfetch/util/license.py, pyproject.toml
Introduce _NoExpandRichHandler (RichHandler subclass) that sets expand=False for rich.table.Table renderables and choose it at handler construction; remove the previous post-construction LogRender mutation. Simplify guess_license_in_file by early-returning on empty probable_licenses. Add a mypy override for dfetch.log.

Sequence Diagram(s)

sequenceDiagram
  participant Apply as GitLocalRepo._apply_src_and_ignore
  participant Filter as _filter_submodules_by_src
  participant Strip as strip_glob_prefix
  participant SafeRM as safe_rm / scheduled removals
  participant Move as _move_src_folder_up
  Apply->>Filter: call(remote, src, submodules)
  loop per submodule
    Filter->>Strip: compute rewritten path under src
    alt rewritten path within src
      Filter->>Filter: update submodule.path (keep in within_src)
    else exact-match src
      Filter->>Filter: keep entry unchanged for promotion
    else outside src
      Filter->>SafeRM: schedule and remove path
    end
  end
  Filter->>SafeRM: remove now-empty parent dirs after deletions
  Filter->>Move: promote src folder to repo root
  Filter-->>Apply: return within_src (kept submodules)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • dfetch-org/dfetch#1144: Modifies _apply_move-adjacent move/cleanup logic in dfetch/vcs/git.py, related to pre-removal and move behavior.

Suggested labels

bug

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: fixing submodule exclusion logic to prevent cloning submodules outside the specified src folder.
Docstring Coverage ✅ Passed Docstring coverage is 92.31% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/subproject-submodules-fetch-e2y7d2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dfetch/vcs/git.py`:
- Line 430: Collect unique top-level directories before calling safe_rm to avoid
redundant removals: instead of calling safe_rm(Path(submodule.path).parts[0],
within=".") for each submodule, build a set of Path(submodule.path).parts[0] (or
equivalent) for all relevant submodules and then iterate that set to call
safe_rm once per unique top-level directory; update the code around the existing
safe_rm invocation in dfetch.vcs.git to use this deduplicated collection.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0f0542b2-6e51-4798-a28b-1f31d2fac6b6

📥 Commits

Reviewing files that changed from the base of the PR and between d6006ec and ecbbebf.

📒 Files selected for processing (2)
  • dfetch/vcs/git.py
  • features/fetch-git-repo-with-submodule.feature

Comment thread dfetch/vcs/git.py Outdated
- Extract _filter_submodules_by_src from _apply_src_and_ignore to keep
  cyclomatic complexity within xenon's block-grade-B limit
- Fix mypy strict error in log.py: replace LogRender inheritance with
  composition to avoid subclassing an untyped (Any) class
- Fix mypy strict error in license.py: unpack probable_licenses[0]
  explicitly so mypy sees exactly two positional args to from_inferred,
  eliminating the spurious "multiple values for keyword argument 'text'"

https://claude.ai/code/session_01KjAnpdH8M33gT1yZPraVx8
When multiple submodules outside src share the same top-level directory,
collect unique top-level dirs first and call safe_rm once per dir instead
of once per submodule.

https://claude.ai/code/session_01KjAnpdH8M33gT1yZPraVx8

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
dfetch/log.py (1)

67-69: ⚠️ Potential issue | 🟡 Minor

Avoid mutating RichHandler private _log_render

File: dfetch/log.py
Lines: 67-69

The below code mutates RichHandler’s private rendering internals:

        handler._log_render = _NoExpandLogRender(  # pylint: disable=protected-access
            show_time=False, show_level=False, show_path=False
        )

Rich doesn’t provide a public API for customizing _log_render; consider refactoring to a custom RichHandler subclass that overrides render (or emit) to apply the non-expanding table behavior (avoiding the protected-access suppression), or document why no supported hook fits this asciinema-specific requirement.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dfetch/log.py` around lines 67 - 69, The code is mutating RichHandler's
private attribute handler._log_render by assigning _NoExpandLogRender; instead
create a custom subclass of RichHandler (e.g., class
NoExpandRichHandler(RichHandler)) and implement an override of render (or emit)
to return/apply the non-expanding table behavior used by _NoExpandLogRender,
then instantiate that subclass instead of assigning to handler._log_render;
replace usages that create a RichHandler with the new NoExpandRichHandler and
remove the protected-access suppression and direct assignment to _log_render in
dfetch/log.py.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dfetch/vcs/git.py`:
- Around line 442-447: The code currently treats submodules whose path exactly
equals src as outside scope because strip_glob_prefix returns the same path;
modify the logic in the block around strip_glob_prefix so you first check if
submodule.path == src and, if so, treat that submodule as in-scope by appending
it to within_src (and do NOT add it to to_remove or change its path), otherwise
continue using new_path != submodule.path to decide whether to update
submodule.path and append to within_src or add the top-level segment to
to_remove; reference functions/vars: strip_glob_prefix, submodule.path, src,
within_src, to_remove.
- Around line 447-450: The code currently adds only the first path component
(Path(submodule.path).parts[0]) to to_remove which can accidentally remove
shared ancestors like "apps" when src is nested; instead add the excluded
submodule's full path or only an ancestor proven disjoint from src. Replace the
parts[0] usage so you collect the exact excluded path (e.g. Path(submodule.path)
as a string) or, if you must remove an ancestor, test that
Path(submodule.path).parts does not share the src prefix (compare
Path(src).parts) before adding it to to_remove; then call safe_rm() with that
exact path(s) and leave _move_src_folder_up(remote, src) unchanged.

---

Outside diff comments:
In `@dfetch/log.py`:
- Around line 67-69: The code is mutating RichHandler's private attribute
handler._log_render by assigning _NoExpandLogRender; instead create a custom
subclass of RichHandler (e.g., class NoExpandRichHandler(RichHandler)) and
implement an override of render (or emit) to return/apply the non-expanding
table behavior used by _NoExpandLogRender, then instantiate that subclass
instead of assigning to handler._log_render; replace usages that create a
RichHandler with the new NoExpandRichHandler and remove the protected-access
suppression and direct assignment to _log_render in dfetch/log.py.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: c3237f98-7a13-4e72-a321-53b99878da87

📥 Commits

Reviewing files that changed from the base of the PR and between ecbbebf and 4ecbb95.

📒 Files selected for processing (3)
  • dfetch/log.py
  • dfetch/util/license.py
  • dfetch/vcs/git.py

Comment thread dfetch/vcs/git.py Outdated
Comment thread dfetch/vcs/git.py Outdated
Composition broke CI because handler._log_render is typed as LogRender
in rich's stubs there, making the assignment incompatible. Revert to
inheritance; use type: ignore[misc] to suppress the local-only error
where rich._log_render has no stubs and LogRender resolves to Any.
warn_unused_ignores=false means the comment is harmless in CI.

https://claude.ai/code/session_01KjAnpdH8M33gT1yZPraVx8
git.py – _filter_submodules_by_src:
- Issue 1: submodule whose path exactly equals src was treated as
  outside-scope because strip_glob_prefix returns unchanged when path
  depth <= pattern depth. Now checked first; submodule is added to
  within_src without altering its path so _move_src_folder_up can
  promote it correctly.
- Issue 2: using parts[0] as the ancestor to remove could clobber the
  src tree when src is nested (e.g. src: apps/myapp, submodule: apps/other
  would incorrectly remove apps/).  Now uses Path(src).is_relative_to()
  to detect a shared ancestor and falls back to the exact submodule path;
  the shared ancestor is cleaned up automatically by _move_src_folder_up.

log.py – _NoExpandRichHandler:
  Replace _NoExpandLogRender (which mutated handler._log_render, a
  protected attribute) with a proper RichHandler subclass that overrides
  the public render() method to set table.expand = False.  configure_root_logger
  now selects the subclass via handler_class rather than post-construction
  mutation, removing the protected-access suppression entirely.

https://claude.ai/code/session_01KjAnpdH8M33gT1yZPraVx8

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
dfetch/vcs/git.py (1)

452-463: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't remove a submodule path that still contains the requested src.

With src="apps/foo" and a top-level submodule at apps, strip_glob_prefix("apps", "apps/foo") stays unchanged, so Line 458 takes the shared-ancestor branch and Line 459 queues safe_rm("apps"). That deletes the only tree containing apps/foo before _move_src_folder_up() runs, so the requested subtree is lost. The ancestor check needs to use the full submodule.path, and paths that are ancestors of src should be skipped here instead of removed.

Suggested fix
             else:
-                sub_top = Path(submodule.path).parts[0]
+                src_path = Path(src)
+                sub_path = Path(submodule.path)
+                sub_top = sub_path.parts[0]
                 # Only remove the top-level component when it is provably disjoint
                 # from src; if src lives under that same ancestor, removing it would
                 # clobber the src tree before _move_src_folder_up can promote it.
                 # In that case use the exact submodule path; the shared ancestor is
                 # cleaned up automatically by _move_src_folder_up.
-                if Path(src).is_relative_to(sub_top):
+                if src_path.is_relative_to(sub_path):
+                    continue
+                if src_path.is_relative_to(sub_top):
                     to_remove.add(submodule.path)
                 else:
                     to_remove.add(sub_top)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dfetch/vcs/git.py` around lines 452 - 463, The current ancestor check uses
only sub_top and thus can remove a top-level submodule that actually contains
the requested src; change the logic to test whether src is inside the full
submodule.path and skip removal in that case: instead of if
Path(src).is_relative_to(sub_top): ... use if
Path(src).is_relative_to(Path(submodule.path)): continue (or skip adding to
to_remove) so that when src is a descendant of submodule.path you do not add
either submodule.path or sub_top to to_remove; otherwise keep the existing
behavior that adds submodule.path (or sub_top) and later calls safe_rm, and
retain reference to _move_src_folder_up for the remaining cleanup.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@dfetch/log.py`:
- Line 26: Remove the inline suppressions from the class declaration of
_NoExpandRichHandler: delete the trailing "# type: ignore[misc]" and "# pylint:
disable=too-few-public-methods" comments on the class line that defines
_NoExpandRichHandler (which subclasses RichHandler), leaving a plain class
declaration; ensure no other code changes are made and run type/pylint checks
(note that the pylint warning applies to ExtLogFilter, not
_NoExpandRichHandler).

---

Duplicate comments:
In `@dfetch/vcs/git.py`:
- Around line 452-463: The current ancestor check uses only sub_top and thus can
remove a top-level submodule that actually contains the requested src; change
the logic to test whether src is inside the full submodule.path and skip removal
in that case: instead of if Path(src).is_relative_to(sub_top): ... use if
Path(src).is_relative_to(Path(submodule.path)): continue (or skip adding to
to_remove) so that when src is a descendant of submodule.path you do not add
either submodule.path or sub_top to to_remove; otherwise keep the existing
behavior that adds submodule.path (or sub_top) and later calls safe_rm, and
retain reference to _move_src_folder_up for the remaining cleanup.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: eb36f474-7b8e-4742-8aef-8e4d8bc0002b

📥 Commits

Reviewing files that changed from the base of the PR and between 4ecbb95 and 36dd882.

📒 Files selected for processing (2)
  • dfetch/log.py
  • dfetch/vcs/git.py

Comment thread dfetch/log.py Outdated
…submodule ancestor check; add unit tests

- log.py: remove `pylint: disable=too-few-public-methods` from _NoExpandRichHandler (it
  inherits many methods from RichHandler so pylint never flags it); retain `type: ignore[misc]`
  which mypy requires because rich lacks complete type stubs.

- vcs/git.py: in _filter_submodules_by_src replace the fragile two-branch
  `sub_top` guard with a single `Path(src).is_relative_to(Path(submodule.path))`
  check.  The old code computed the top-level path component separately and
  could mistakenly remove a top-level submodule whose full path is actually an
  ancestor of src (e.g. submodule at "apps", src="apps/myapp").  The new code
  tests the full path and skips removal with `continue`, leaving the ancestor
  submodule in place so _move_src_folder_up can promote its content correctly.

- tests/test_git_vcs.py: add two new unit tests for _filter_submodules_by_src:
  - test_filter_submodules_ancestor_of_src_not_removed: proves that a submodule
    at "apps" is not deleted when src="apps/myapp" and that the inner content is
    promoted to root.
  - test_filter_submodules_disjoint_submodule_removed: proves that a truly
    out-of-scope submodule (different top-level dir) is still removed.

https://claude.ai/code/session_01KjAnpdH8M33gT1yZPraVx8
@spoorcc

spoorcc commented Jun 11, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
dfetch/log.py (1)

26-26: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Remove inline type suppression and fix the underlying typing mismatch.

The inline # type: ignore[misc] on _NoExpandRichHandler violates the repository rule for production code suppressions.

Suggested fix
-class _NoExpandRichHandler(RichHandler):  # type: ignore[misc]
+class _NoExpandRichHandler(RichHandler):

As per coding guidelines, “Avoid inline lint suppressions (# noqa, # type: ignore, # pylint: disable, # pyright: ignore) without fixing the root cause; exception permitted for module-level tool headers (lines 1–5) in test files.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dfetch/log.py` at line 26, Remove the inline "# type: ignore[misc]" on the
class declaration for _NoExpandRichHandler and fix the underlying typing
mismatch: make _NoExpandRichHandler(RichHandler) conform to RichHandler's type
signature by adjusting the class' base/generic parameters or by adding the
missing/incorrect type annotations on its methods (e.g., __init__, emit, or any
overridden methods) so the subclass matches the parent's expected signatures;
ensure imports for any typing primitives (TypeVar, Generic, Any, Optional) are
added and not suppressed so the typechecker accepts the class without an inline
ignore.

Source: Coding guidelines

dfetch/vcs/git.py (1)

452-456: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Top-level deletion can still remove the selected src tree.

to_remove.add(Path(submodule.path).parts[0]) on Line 454 is unsafe when excluded submodules share the same top-level directory as src (e.g., apps/other with src=apps/myapp), because deleting apps removes in-scope content before _move_src_folder_up.

Suggested fix
             else:
                 if Path(src).is_relative_to(Path(submodule.path)):
                     continue
-                to_remove.add(Path(submodule.path).parts[0])
+                src_top = Path(src).parts[0] if Path(src).parts else ""
+                sub_path = Path(submodule.path)
+                sub_top = sub_path.parts[0] if sub_path.parts else ""
+                to_remove.add(submodule.path if sub_top == src_top else sub_top)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dfetch/vcs/git.py` around lines 452 - 456, The current logic collects only
the top-level directory (Path(submodule.path).parts[0]) which can delete a
sibling tree that contains the target src (e.g., apps); instead, collect and
remove the exact submodule paths and not their top-level parent: change the
to_remove set to hold full Path(submodule.path) values (keep the existing check
using src and submodule.path) and call safe_rm on each full submodule path (not
the top-level part), referencing variables/functions submodule.path, src,
to_remove, Path(), safe_rm and the surrounding _move_src_folder_up flow so the
removals only delete the precise submodule dirs.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@dfetch/log.py`:
- Line 26: Remove the inline "# type: ignore[misc]" on the class declaration for
_NoExpandRichHandler and fix the underlying typing mismatch: make
_NoExpandRichHandler(RichHandler) conform to RichHandler's type signature by
adjusting the class' base/generic parameters or by adding the missing/incorrect
type annotations on its methods (e.g., __init__, emit, or any overridden
methods) so the subclass matches the parent's expected signatures; ensure
imports for any typing primitives (TypeVar, Generic, Any, Optional) are added
and not suppressed so the typechecker accepts the class without an inline
ignore.

In `@dfetch/vcs/git.py`:
- Around line 452-456: The current logic collects only the top-level directory
(Path(submodule.path).parts[0]) which can delete a sibling tree that contains
the target src (e.g., apps); instead, collect and remove the exact submodule
paths and not their top-level parent: change the to_remove set to hold full
Path(submodule.path) values (keep the existing check using src and
submodule.path) and call safe_rm on each full submodule path (not the top-level
part), referencing variables/functions submodule.path, src, to_remove, Path(),
safe_rm and the surrounding _move_src_folder_up flow so the removals only delete
the precise submodule dirs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: caf8e1b0-c9fd-4e5f-8f2f-8871a8523c22

📥 Commits

Reviewing files that changed from the base of the PR and between 36dd882 and 84c6b76.

📒 Files selected for processing (3)
  • dfetch/log.py
  • dfetch/vcs/git.py
  • tests/test_git_vcs.py

… handle empty parents and .git collision

The previous fix used Path(submodule.path).parts[0] in to_remove, which for a
sibling submodule (e.g. apps/widget when src=apps/lib) would add the shared
top-level dir 'apps' to to_remove and call safe_rm('apps'), destroying the
already-cloned apps/lib content before _move_src_folder_up could promote it.

Changes:
- vcs/git.py: _filter_submodules_by_src now adds the exact submodule.path to
  to_remove instead of its first path component, so only the precise out-of-scope
  submodule directory is deleted.

- vcs/git.py: add _remove_empty_parents static helper that walks up from each
  removed path and rmdir()s any ancestor directories that become empty.  git
  submodule update creates parent dirs (e.g. other_ext/) for submodules excluded
  by sparse-checkout; the old top-level removal cleaned those up for free, so
  the helper restores that behaviour without the sibling-clobbering side-effect.

- vcs/git.py: _apply_move now pre-removes .git and .gitmodules from the root of
  the chosen directory before promoting its contents to the repo root.  When src
  is itself a cloned submodule (e.g. src: apps/lib), the submodule's .git file
  would collide with the parent repo's .git directory; removing it beforehand
  avoids the shutil.Error.  The caller (gitsubproject) already cleans up these
  files recursively after checkout completes.

- tests/test_git_vcs.py: update test_filter_submodules_disjoint_submodule_removed
  to assert on the exact submodule path (not its parent); add new
  test_filter_submodules_sibling_of_src_not_removed which proves the bug — with
  the old code safe_rm('apps') would remove apps/lib/ before promotion, so
  README.md would never reach root.

- features/fetch-git-repo-with-submodule.feature: add end-to-end scenario
  "A sibling submodule at the same top-level dir as src is not fetched" that
  fails under the old top-level removal logic and passes with the fix.

https://claude.ai/code/session_01KjAnpdH8M33gT1yZPraVx8
The pre-commit mypy binary runs in an isolated uv-tool venv that lacks
rich, so RichHandler resolved to Any and triggered [misc] on subclassing.
Move the suppression to a targeted per-module mypy override
(disallow_subclassing_any = false for dfetch.log) so the inline comment
is gone and the setting is explicit in configuration.

https://claude.ai/code/session_01KjAnpdH8M33gT1yZPraVx8
@spoorcc

spoorcc commented Jun 12, 2026

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@ben-edna ben-edna merged commit c25a734 into main Jun 12, 2026
36 checks passed
@ben-edna ben-edna deleted the claude/subproject-submodules-fetch-e2y7d2 branch June 12, 2026 15:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants